Unsupervised Outlier Profile Analysis

Authors

  • Debashis Ghosh
  • Song Li
Abstract

In much of the analysis of high-throughput genomic data, "interesting" genes have been selected based on assessment of differential expression between two groups or generalizations thereof. Most of the literature focuses on changes in mean expression or the entire distribution. In this article, we explore the use of C(α) tests, which have been applied in other genomic data settings. Their use for the outlier expression problem, in particular with continuous data, is problematic but nevertheless motivates new statistics that give an unsupervised analog to previously developed outlier profile analysis approaches. Some simulation studies are used to evaluate the proposal. A bivariate extension is described that can accommodate data from two platforms on matched samples. The proposed methods are applied to data from a prostate cancer study.

Similar resources

Dominant Features Identification for Covert Nodes in 9/11 Attack Using Their Profile

Terrorism has recently come to pose a threat to homeland security. The major problem faced in network analysis is to automatically identify the key player who can maximally influence other nodes in a large relational covert network. The existing centrality-based and graph-theoretic approaches are more concerned with the network structure than with the node attributes. In this paper an unsupervised ...

A Taxonomy Framework for Unsupervised Outlier Detection Techniques for Multi-Type Data Sets

The term “outlier” can generally be defined as an observation that is significantly different from the other values in a data set. The outliers may be instances of error or indicate events. The task of outlier detection aims at identifying such outliers in order to improve the analysis of data and further discover interesting and useful knowledge about unusual events within numerous application...

Outlier Detection in Wireless Sensor Networks Using Distributed Principal Component Analysis

Detecting anomalies is an important challenge for intrusion detection and fault diagnosis in wireless sensor networks (WSNs). To address the problem of outlier detection in wireless sensor networks, in this paper we present a PCA-based centralized approach and a DPCA-based distributed energy-efficient approach for detecting outliers in sensed data in a WSN. The outliers in sensed data can be ca...

Penalized unsupervised learning with outliers.

We consider the problem of performing unsupervised learning in the presence of outliers - that is, observations that do not come from the same distribution as the rest of the data. It is known that in this setting, standard approaches for unsupervised learning can yield unsatisfactory results. For instance, in the presence of severe outliers, K-means clustering will often assign each outlier to...

Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data

Outlier detection is useful for credit card fraud detection. Because of the drastic increase in digital fraud, financial losses are substantial, and various techniques have therefore been developed for fraud detection and applied to diverse business fields. In high-dimensional data, outlier detection presents some challenges because of the increase in dimensionality. In this paper, the proposed model aims to ...

Journal title:

Volume 13, Issue

Pages -

Publication date: 2014